39 research outputs found

    Morphological Analysis as Classification: an Inductive-Learning Approach

    Full text link
    Morphological analysis is an important subtask in text-to-speech conversion, hyphenation, and other language engineering tasks. The traditional approach to performing morphological analysis is to combine a morpheme lexicon, sets of (linguistic) rules, and heuristics to find the most probable analysis. In contrast, we present an inductive learning approach in which morphological analysis is reformulated as a segmentation task. We report on a number of experiments in which five inductive learning algorithms are applied to three variations of the task of morphological analysis. Results show (i) that the generalisation performance of the algorithms is good, and (ii) that the lazy learning algorithm IB1-IG performs best on all three tasks. We conclude that lazy learning of morphological analysis as a classification task is indeed a viable approach; moreover, it has the strong advantages over the traditional approach of avoiding the knowledge-acquisition bottleneck, being fast and deterministic in learning and processing, and being language-independent.
    Comment: 11 pages, 5 encapsulated PostScript figures; uses the non-standard NeMLaP proceedings style nemlap.sty; inputs ipamacs (International Phonetic Alphabet) and epsf macros.
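    The reformulation described above can be sketched in a few lines: each character position becomes a classification instance (a window of surrounding characters), labelled with whether a morpheme boundary follows. The sketch below uses toy training data and a plain 1-NN overlap metric; the paper's IB1-IG additionally weights features by information gain, which is omitted here for brevity.

```python
# Minimal sketch: morphological segmentation recast as classification.
# Toy data; 1-NN with a simple feature-overlap metric (IB1-IG's
# information-gain feature weighting is omitted).

def windows(word, size=3, pad="_"):
    """Yield (features, position): a character window around each
    position where a morpheme boundary could follow."""
    padded = pad * size + word + pad * size
    for i in range(len(word)):
        left = padded[i:i + size]
        focus = padded[i + size]          # the character at position i
        right = padded[i + size + 1:i + 2 * size + 1]
        yield tuple(left) + (focus,) + tuple(right), i

def train(segmented_words):
    """Build an instance base from words with '+' marking boundaries,
    e.g. 'book+s' means a boundary after index 3."""
    base = []
    for seg in segmented_words:
        word = seg.replace("+", "")
        boundaries, offset = set(), 0
        for j, ch in enumerate(seg):
            if ch == "+":
                boundaries.add(j - offset - 1)  # boundary follows this index
                offset += 1
        for feats, i in windows(word):
            base.append((feats, i in boundaries))
    return base

def segment(word, base):
    """Classify each position with 1-NN (feature overlap); insert '+'."""
    out = []
    for feats, i in windows(word):
        label = max(base,
                    key=lambda ex: sum(a == b for a, b in zip(ex[0], feats)))[1]
        out.append(word[i])
        if label:
            out.append("+")
    return "".join(out).rstrip("+")
```

    With an instance base built from a few segmented plurals, an unseen word is segmented by analogy to its nearest stored window, which is exactly the lazy-learning behaviour the abstract describes.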

    Positiewoordenboek van de Nederlandse taal

    No full text

    Discovering Rules with a Genetic Sequential Covering Algorithm (GeSeCo)

    No full text
    Lists of if-then rules (i.e. ordered rule sets) are among the most expressive and intelligible representations for inductive learning algorithms. Two extreme strategies for searching for such lists of rules can be distinguished: (i) local strategies, primarily based on a step-by-step search for the optimal list of rules, and (ii) global strategies, primarily based on a one-strike search for the optimal list of rules. Both approaches have their disadvantages. In this paper we present an intermediate strategy: a sequential covering strategy is combined with a one-strike genetic search for the most promising next rule. To achieve this, a new rule-fitness function is introduced. Experimental results are reported in which the performance of our intermediate approach is compared to that of other rule-learning algorithms.

    An Optimization Framework for Process Discovery Algorithms

    No full text
    Today there are many process mining techniques that, based on an event log, allow for the automatic induction of a process model. The process mining algorithms that are able to deal with incomplete event logs, exceptions, and noise typically have many parameters for tuning the algorithm. Therefore, the user needs to select the right parameter setting using a trial-and-error approach. So far, there is no general method available to search for an optimal parameter setting. One of the problems is the lack of negative examples and the absence of a standard measure for the quality of mined process models. Therefore, the so-called k-fold-cv experimental set-up as used in the machine learning community cannot be applied directly. This paper describes an adapted version of the k-fold-cv set-up so that it can be used in the context of process mining. Illustrative experimental results of applying this method in combination with the HeuristicsMiner process mining algorithm and three different performance measurements are presented. Using the k-fold-cv experimental set-up and an event log with low-frequency behavior and noise, it appears possible to find the optimal parameter setting. Another important result is that the simple combination of yes/no parsing of a trace with negative examples based on noise is sufficient for parameter optimization. This makes the framework universally applicable for benchmarking different process mining algorithms with different process-model representation languages.
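    The adapted set-up can be sketched concretely: split the traces into k folds, mine a model on k-1 folds, then score the held-out fold with yes/no parsing, using noise-injected copies of the held-out traces as negative examples. The sketch below is a deliberately simplified stand-in: a directly-follows miner with a frequency threshold plays the role of HeuristicsMiner and its dependency parameters, and an adjacent-event swap plays the role of noise injection; none of these are the paper's actual definitions.

```python
# Toy k-fold-cv set-up for process mining: positives must parse,
# noise-injected negatives must not.
import random
from collections import Counter

def mine(traces, threshold):
    """Toy directly-follows miner (stand-in for HeuristicsMiner):
    keep an a->b edge only if it occurs at least `threshold` times."""
    counts = Counter((a, b) for t in traces for a, b in zip(t, t[1:]))
    return {edge for edge, n in counts.items() if n >= threshold}

def parses(model, trace):
    """Yes/no parsing: the model replays the trace iff every
    directly-follows pair is allowed."""
    return all((a, b) in model for a, b in zip(trace, trace[1:]))

def noisy(trace, rng):
    """Negative example: swap two adjacent events (injected noise)."""
    t = list(trace)
    i = rng.randrange(len(t) - 1)
    t[i], t[i + 1] = t[i + 1], t[i]
    return tuple(t)

def kfold_score(log, threshold, k=3, rng=None):
    """k-fold cross-validation score for one parameter setting."""
    rng = rng or random.Random(0)
    folds = [log[i::k] for i in range(k)]
    total = correct = 0
    for i in range(k):
        train = [t for j, f in enumerate(folds) if j != i for t in f]
        model = mine(train, threshold)
        for t in folds[i]:
            correct += parses(model, t)                   # positive example
            correct += not parses(model, noisy(t, rng))   # negative example
            total += 2
    return correct / total
```

    Parameter optimization is then just `max(candidates, key=lambda th: kfold_score(log, th))` over the candidate thresholds: on a log dominated by one clean variant plus a little noise, a stricter threshold filters the noise edges out of the mined model and scores higher.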

    Stretching the Limits of Learning Without Modules

    No full text
    Decomposing a hard problem into easier sub-problems (`modularisation') is a powerful problem-solving technique. Modularisation is often based on expert knowledge and can lead to efficient high-performance models. Contrasting with this expert-based approach is the approach of machine-learning algorithms, such as back-propagation and symbolic inductive-learning algorithms, that do not make use of a predetermined modular architecture. We present examples of machine-learned models without modules for problems that are traditionally solved by expert-based modularisation. The machine-learned models perform as well as or better than the expert-based models. This surprising fact raises the question of whether the performance of machine-learned models could be further increased if modularisation were somehow incorporated into the learning algorithms. We describe work in progress on the development of machine learning algorithms that automatically construct modular architectures during learning.